CS 8321 Lab 1

Francesco Trozzi (47779944) - George Sammit (04010135) - Megan Simons (46334773)

Part 1: Select a CNN

In [3]:
# Works with Python 3.5 & Keras 2.0.8
import sys
print(sys.version)
import keras
print(keras.__version__)

import warnings
warnings.filterwarnings('ignore')
3.5.6 |Anaconda, Inc.| (default, Aug 26 2018, 16:30:03) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
Using TensorFlow backend.
2.0.8

Image Processing Utilities

In [4]:
from PIL import Image
from keras.preprocessing import image
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.patches as patches
In [5]:
def load_image(local_file):
    return image.load_img(local_file, target_size=(224, 224)) # 224x224 to match VGG
In [6]:
def show_image(img, title=None, cmap=None):
    if cmap is None:
        plt.imshow(img)
    else:
        plt.imshow(img, cmap=cmap)
    plt.title(title)
    plt.show()
In [7]:
import numpy as np
from keras.applications.vgg19 import preprocess_input

def preprocess_image_for_VGG(sample):
    """Set up an image for input into VGG"""
    processed = np.array(sample, dtype=float)
    processed = np.expand_dims(processed, axis=0)
    processed = preprocess_input(processed)
    return processed
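Under the hood, Keras' `preprocess_input` for the VGG models (in its default 'caffe' mode) converts RGB to BGR and subtracts the ImageNet per-channel means. A minimal NumPy sketch of that behavior (our own re-implementation for illustration; `caffe_style_preprocess` is not a Keras function):

```python
import numpy as np

# Hypothetical re-implementation (illustration only) of caffe-style preprocessing:
# reverse RGB -> BGR, then subtract the ImageNet per-channel means.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68])

def caffe_style_preprocess(batch):
    """batch: float array of shape (n, h, w, 3) in RGB order."""
    bgr = batch[..., ::-1].astype(float)  # reverse the channel axis: RGB -> BGR
    return bgr - IMAGENET_BGR_MEANS

# A black image becomes the negated channel means at every pixel
black = np.zeros((1, 224, 224, 3))
out = caffe_style_preprocess(black)
```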

Selected CNN -- VGG

We chose VGG as our CNN because it is included in Keras, was trained on ImageNet as recommended, has high accuracy, and is fairly straightforward. The architecture of VGG16 is depicted below (image source: https://neurohive.io/en/popular-networks/vgg16/). This visual explanation is far better than we could provide in words. To summarize, however: a 224x224 color image is input. It goes through five separate convolution blocks (2-3 convolutions in series), each followed by max-pooling. It then progresses through a flattening layer and two fully connected layers before a softmax-based classification.

We originally chose VGG16, but later moved to VGG19 because OpenAI Microscope does not support the earlier model. Unfortunately, we couldn't find as nice an illustration of VGG19, so we instead show the difference, which is the addition of a fourth convolution in blocks 3, 4, and 5 (image source: http://datahacker.rs/deep-learning-vgg-16-vs-vgg-19/).

Note: It was interesting and a bit surprising to see the effect these extra layers had on the activations of the earlier layers we chose.

vgg16-1-e1542731207177%5B1%5D.png vgg-ispravljeno--718x1024.png
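As a quick cross-check on the summary below, each layer's parameter count can be computed by hand (a standalone sketch; `conv_params` and `dense_params` are our own helper names):

```python
# A Conv2D layer with a k x k kernel, c_in input channels, and c_out filters
# has k*k*c_in*c_out weights plus c_out biases; a Dense layer has
# n_in*n_out weights plus n_out biases.
def conv_params(k, c_in, c_out):
    return k * k * c_in * c_out + c_out

def dense_params(n_in, n_out):
    return n_in * n_out + n_out

print(conv_params(3, 3, 64))      # block1_conv1: 1792
print(conv_params(3, 64, 64))     # block1_conv2: 36928
print(conv_params(3, 128, 256))   # block3_conv1: 295168
print(dense_params(25088, 4096))  # fc1: 102764544
```

These match the `Param #` column in the summary.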

In [8]:
from keras.applications import VGG19

# Load the pre-trained VGG19 model
vgg_model = VGG19(weights='imagenet')
vgg_model.summary()
WARNING:tensorflow:From /Users/megansimons/opt/anaconda3/envs/neuralnetworks/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py:1205: calling reduce_prod (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv4 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv4 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv4 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 143,667,240
Trainable params: 143,667,240
Non-trainable params: 0
_________________________________________________________________

Check some images

In [9]:
from keras.applications.vgg19 import preprocess_input, decode_predictions

def predict(model, sample):
    """Given a sample image, return a prediction"""
    prediction = model.predict(preprocess_image_for_VGG(sample))
    print('Predicted:', decode_predictions(prediction, top=5)[0])

VGG has no doubt that this is indeed a pizza

In [11]:
PIZZA = load_image('pizza-2.png')
show_image(PIZZA, "Pizza")
In [12]:
predict(vgg_model, PIZZA)
Predicted: [('n07873807', 'pizza', 0.9998196), ('n07693725', 'bagel', 2.8598877e-05), ('n04263257', 'soup_bowl', 2.3215804e-05), ('n07613480', 'trifle', 2.2443983e-05), ('n03400231', 'frying_pan', 2.1630263e-05)]

Fire is not an ImageNet class; volcano seems logical, but we're not so sure about the others...

In [13]:
FIRE = load_image('fire.jpg')
show_image(FIRE, "Fire")
In [14]:
predict(vgg_model, FIRE)
Predicted: [('n01981276', 'king_crab', 0.6611748), ('n01910747', 'jellyfish', 0.035846423), ('n02321529', 'sea_cucumber', 0.032980043), ('n09472597', 'volcano', 0.029177697), ('n10565667', 'scuba_diver', 0.01622314)]

The quintessential dog. There are many classes of dogs in ImageNet...

In [15]:
DOG = load_image('dog.jpg')
show_image(DOG, "Dog")
In [16]:
predict(vgg_model, DOG)
Predicted: [('n02099601', 'golden_retriever', 0.99595535), ('n04409515', 'tennis_ball', 0.0029503256), ('n02100877', 'Irish_setter', 0.0002797395), ('n02090379', 'redbone', 0.00020634016), ('n02099712', 'Labrador_retriever', 0.00017253029)]

Now our tricky one... It found the dog...

In [17]:
DOG_FIRE_PIZZA = load_image('dog_fire_pizza.png')
show_image(DOG_FIRE_PIZZA, "Dog, Fire, Pizza")
In [18]:
predict(vgg_model, DOG_FIRE_PIZZA)
Predicted: [('n02111889', 'Samoyed', 0.9577753), ('n02112018', 'Pomeranian', 0.013005534), ('n02111500', 'Great_Pyrenees', 0.009890466), ('n02112137', 'chow', 0.0071743694), ('n02120079', 'Arctic_fox', 0.003444356)]

Filter Selection & Visualization (Dog, Fire, Pizza Image)

For this section we employed some pre-built functions for visualizing layer activations and generating patterns that maximally excite a filter. To show our understanding of the inner workings of this code, we have heavily commented it. To separate our comments from the original author's, ours are preceded by three # characters (###).

In [19]:
images_per_row = 16

def show_layer(layer_name, layer_activation):
    # This is the number of features in the feature map
    n_features = layer_activation.shape[-1] ### Actually Number of filters
    
    # The feature map has shape (1, size, size, n_features)
    size = layer_activation.shape[1] ### spatial size (rows/columns) of the feature map

    # We will tile the activation channels in this matrix
    n_cols = n_features // images_per_row  
    display_grid = np.zeros((size * n_cols, images_per_row * size)) ### define the size of the display grid
    
    # We'll tile each filter into this big horizontal grid
    for col in range(n_cols):
        for row in range(images_per_row):
            ### we get the activation output for:
                ### 0 as we have only one image
                ### all rows and columns of the selected kernel
                ### the last part selects the specific kernels that will show into the corresponding row and column 
            channel_image = layer_activation[0,:, :,col * images_per_row + row]
            # Post-process the feature to make it visually palatable
            ### Normalize the activation image by subtracting the mean and dividing by the standard deviation
            channel_image -= channel_image.mean()
            channel_image /= channel_image.std()
            ### rescale to a standard deviation of 64 centered at 128
            channel_image *= 64
            channel_image += 128
            ### clip values to 0-255 (the uint8 range)
            channel_image = np.clip(channel_image, 0, 255).astype('uint8')
            ###allocate activation image in the grid
            display_grid[col * size : (col + 1) * size,
                         row * size : (row + 1) * size] = channel_image

    # Display the grid
    scale = 1. / size
    plt.figure(figsize=(scale * display_grid.shape[1], scale * display_grid.shape[0]))
    plt.title(layer_name)
    plt.grid(False)
    plt.imshow(display_grid, aspect='auto', cmap='viridis')
    plt.show()
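The post-processing inside show_layer can be sanity-checked in isolation (a standalone sketch; unlike the loop above, we add a small epsilon so a flat channel cannot divide by zero):

```python
import numpy as np

# Standardize to zero mean / unit std, rescale to std 64 centered at 128,
# then clip into the uint8 range -- the same mapping used in show_layer.
def to_displayable(channel_image):
    x = channel_image.astype(float)
    x -= x.mean()
    x /= x.std() + 1e-8  # epsilon guards against a constant (dead) channel
    x = x * 64 + 128
    return np.clip(x, 0, 255).astype('uint8')

rng = np.random.RandomState(0)
tile = to_displayable(rng.normal(size=(56, 56)))
```

Any real-valued activation map comes out as a uint8 image centered near gray (128).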
In [20]:
from keras import models

# Extracts the outputs of the top few layers:
layer_outputs = [layer.output for layer in vgg_model.layers[:10]]

# Creates a model that will return these outputs, given the model input:
activation_model = models.Model(inputs=vgg_model.input, outputs=layer_outputs)

print(activation_model.summary())
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
=================================================================
Total params: 1,735,488
Trainable params: 1,735,488
Non-trainable params: 0
_________________________________________________________________
None
In [21]:
# This will return a list of arrays, one array per layer activation (10 layers here)
img_tensor = preprocess_image_for_VGG(DOG_FIRE_PIZZA)
print(img_tensor.shape)
activations = activation_model.predict(img_tensor)
first_layer_activation = activations[0]
print(first_layer_activation.shape)
(1, 224, 224, 3)
(1, 224, 224, 3)
In [22]:
second_layer_activation = activations[1]
print(second_layer_activation.shape)
(1, 224, 224, 64)
In [23]:
# These are the names of the layers, so can have them as part of our plot
layer_names = []
for layer in vgg_model.layers[:10]:
    layer_names.append(layer.name)

images_per_row = 16

# Now let's display a feature map. Note that we skip the input layer (layer[0]) and show only block1_conv1
for layer_name, layer_activation in zip(layer_names[1:2], activations[1:2]):
    print('layer_activation shape:')
    print(layer_activation.shape)
    show_layer(layer_name, layer_activation)

plt.show()
layer_activation shape:
(1, 224, 224, 64)

Analyze a Filter (block3_conv2, filter[8])

In [24]:
from keras.applications import VGG19
from keras import backend as K

# Load the pre-trained VGG19 model
model = VGG19(weights='imagenet', include_top=False)

# Selecting a layer and channel to visualize
layer_name = 'block3_conv1'
filter_index = 0
 
# Isolate the output and loss for the given channel
layer_output = model.get_layer(layer_name).output
loss = K.mean(layer_output[:, :, :, filter_index])

# We take the gradient of this loss using keras backend.gradients
grads = K.gradients(loss, model.input)[0]

# Before performing gradient descent, we divide the gradient tensor by its L2 norm (square root
# of the mean of the square of values in the tensor). We add a small epsilon term to the L2 norm
# to avoid division by zero.
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)

# We use a keras backend function to accept a numpy tensor and return a loss and gradient for that tensor.
iterate = K.function([model.input], [loss, grads])

# To quickly test the interface:
loss_value, grads_value = iterate([np.zeros((1, 150, 150, 3))])
WARNING:tensorflow:From /Users/megansimons/opt/anaconda3/envs/neuralnetworks/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py:1290: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead

Select a filter (i.e., a feature) in a layer to analyze as part of a circuit. This should be a filter in a "mid-level" portion of the network (that is, there are a few convolutional layers before and after the chosen layer).

In [25]:
# Print the model architecture (note: include_top=False, so no classification layers)
print(model.summary())
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv4 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv4 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv4 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, None, None, 512)   0         
=================================================================
Total params: 20,024,384
Trainable params: 20,024,384
Non-trainable params: 0
_________________________________________________________________
None
In [26]:
def deprocess_image(x):
    # normalize tensor: center on 0., ensure std is 0.1
    x -= x.mean()
    x /= (x.std() + 1e-5)
    x *= 0.1

    # clip to [0, 1]
    x += 0.5
    x = np.clip(x, 0, 1)

    # convert to RGB array
    x *= 255
    x = np.clip(x, 0, 255).astype('uint8')
    return x
In [27]:
def generate_pattern(layer_name, filter_index, size=150):
    # Build a loss function that maximizes the activation
    # of the nth filter of the layer considered.
    layer_output = model.get_layer(layer_name).output ### this gives the ReLU outputs for ALL the filters of this layer
    loss = K.mean(layer_output[:, :, :, filter_index]) ### get the mean activation output for the specific filter

    # Compute the gradient of the input picture wrt this loss
    # add this operation to the computation graph
    grads = K.gradients(loss, model.input)[0] ### get the gradient of the loss with respect to the input image

    # Normalization trick: we normalize the gradient
    grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5) ### L2 normalization - the 1e-5 avoids division by zero

    # This function returns the loss and grads given the input picture
    # get back the computation graph operations to run
    iterate = K.function([model.input], [loss, grads]) ### creates a function that takes the model input (an image) and returns the loss and the gradients

    # We start from a gray image with some uniform noise
    input_img_data = np.random.random((1, size, size, 3)) * 20 + 128. ### np.random creates an image-like array of gray values plus uniform noise

    # Run gradient ascent for 40 steps
    step = 1.
    for i in range(40):
        loss_value, grads_value = iterate([input_img_data]) ### compute the current loss and gradients
        input_img_data += grads_value * step ### step the image along the gradient to increase the activation

    img = input_img_data[0] # batch of one image, so take the first element
    return deprocess_image(img)
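The loop in generate_pattern is plain gradient ascent with an L2-normalized gradient and a fixed step. The same pattern on a toy objective f(x) = -(x - 3)^2, whose gradient is -2(x - 3), shows the behavior (our own toy example, not part of the lab code):

```python
import numpy as np

x = np.array([10.0])
step = 1.0
for _ in range(40):
    grads_value = -2.0 * (x - 3.0)                                  # gradient of f at x
    grads_value /= np.sqrt(np.mean(np.square(grads_value))) + 1e-5  # same normalization trick
    x += grads_value * step                                         # ascend

# With a fixed step, normalized ascent hovers within roughly one
# step-size of the maximizer (here x = 3), rather than converging exactly.
```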
In [28]:
filter_img = generate_pattern('block3_conv2', 2, size=64)
In [29]:
for layer_name in ['block3_conv2']:
    size = 64
    margin = 5

    # This is an empty (black) image where we will store our results.
    results = np.zeros((8 * size + 7 * margin, 8 * size + 7 * margin, 3))

    for i in range(8):  # iterate over the rows of our results grid
        for j in range(8):  # iterate over the columns of our results grid
            # Generate the pattern for filter `i + (j * 8)` in `layer_name`
            filter_img = generate_pattern(layer_name, i + (j * 8), size=size)
            # Put the result in the square `(i, j)` of the results grid
            horizontal_start = i * size + i * margin
            horizontal_end = horizontal_start + size
            vertical_start = j * size + j * margin
            vertical_end = vertical_start + size
            results[horizontal_start: horizontal_end, vertical_start: vertical_end, :] = filter_img

    # Display the results grid
    plt.figure(figsize=(20, 20))
    plt.imshow(results.astype(np.uint8))  # results already holds 0-255 values from deprocess_image
    plt.show()
In [30]:
block3_conv2_filter8_exciter = generate_pattern('block3_conv2', 8)
show_image(block3_conv2_filter8_exciter)
In [31]:
# Save the image
im = Image.fromarray(block3_conv2_filter8_exciter)
im.save("block3_conv2_filter8_exciter.jpeg")
Grab the activations
In [32]:
from keras import models

# Extracts the outputs of all the layers:
layer_outputs = [layer.output for layer in model.layers[:]]
# Creates a model that will return these outputs, given the model input:
activation_model = models.Model(inputs=model.input, outputs=layer_outputs)
Start with the image above. That seems pretty excitatory
In [33]:
EXCITER = load_image('block3_conv2_filter8_exciter.jpeg')
img_tensor = preprocess_image_for_VGG(EXCITER)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(EXCITER, "Exciter Image")
show_image(selected_layer_activation[0, :, :, 8], title='Activation visualization from above', cmap='viridis')
Our selected image. We found a marinara detector!
In [34]:
img_tensor = preprocess_image_for_VGG(DOG_FIRE_PIZZA)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(DOG_FIRE_PIZZA, "Our selected image")
show_image(selected_layer_activation[0, :, :, 8], title='Activation visualization from above', cmap='viridis')
Discussion

We hypothesize that this filter detects changes in color brightness in high-frequency patterns. From the chosen image's activation above, we show that contrast changes in small details are picked up: the burnt spots in the pizza crust, the small color changes in the sauce due to garlic and oregano, the eyes, ears, and nose of the dog, and the charcoal of the fire. To conclude, we believe, in layman's terms, that we have identified a spot detector.

Part 3: Analyze the incoming filters

Just look at the layer make-up
In [35]:
# Above, we looked at block3_conv2 (56, 56, 256), channel 8.
block3_conv2 = vgg_model.layers[8]
print("name: {}".format(block3_conv2.name))
print("filters: {}".format(block3_conv2.filters))
print("kernel size: {}".format(block3_conv2.kernel_size))
print("strides: {}".format(block3_conv2.strides))
print("padding: {}".format(block3_conv2.padding))
name: block3_conv2
filters: 256
kernel size: (3, 3)
strides: (1, 1)
padding: same

A simple approach

Just look at the weights of the current layer. Luckily (thanks, Francesco, for the good pick), the layer being investigated and the previous layer have the same number of output channels.

In [45]:
# Just look at the weights in the current layer
block3_conv2_weights = block3_conv2.get_weights()[0] # 0 is the weights, 1 is the bias
block3_conv2_filter8_weights = block3_conv2_weights[:, :, :, 8]
print(block3_conv2_filter8_weights.shape)
(3, 3, 256)
In [46]:
# L2 norm of each incoming channel's 3x3 kernel, collected into a list
channels = block3_conv2_filter8_weights.shape[2]
l2_norms = [np.linalg.norm(block3_conv2_filter8_weights[:, :, channel], ord='fro') for channel in range(channels)]
l2_norms = np.array(l2_norms)
print("shape: {}".format(l2_norms.shape))
print("Min: {} Max: {}".format(np.min(l2_norms), np.max(l2_norms)))
shape: (256,)
Min: 0.004066952038556337 Max: 0.22880826890468597
In [50]:
# Get the indexes of the top 6.  Note that sorting is in ascending order,
# hence taking from the end
sorted_norms = np.sort(l2_norms)[-6:]

# Duplicates are highly unlikely
top_indexes = np.concatenate([np.where(l2_norms==sorted_norms[index])[0] for index in range(len(sorted_norms))])
# Make them highest to lowest (older NumPy's np.flip requires an explicit axis)
top_indexes = np.flip(top_indexes, axis=0)
print("Indexes of top six filters: {}".format(top_indexes))
print()
_ = [print("index: {} value: {}".format(index, l2_norms[index])) for index in top_indexes]

Calling np.flip without an axis argument raised a TypeError under this environment's older NumPy, so we pass axis=0 explicitly. Attached below is a screenshot of the output from a previous run so you can see the result. --Megan

Top6.png
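The same top-six selection can be written more compactly with np.argsort, which also sidesteps the np.flip signature difference across NumPy versions (a sketch; `top_k_indexes` is our own helper name):

```python
import numpy as np

def top_k_indexes(values, k=6):
    """Indexes of the k largest values, highest first.
    argsort is ascending, so take the last k and reverse by slicing."""
    return np.argsort(np.asarray(values))[-k:][::-1]

# e.g. top_k_indexes(l2_norms) on the norms computed above; a toy demo:
demo = top_k_indexes([0.1, 0.9, 0.3, 0.7, 0.5, 0.2, 0.8], k=3)
print(demo)  # [1 6 3]
```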

A more thorough approach

Here, the previous layer is checked, and the code accounts for differences in the size of the feature maps.

Article: https://distill.pub/2020/circuits/visualizing-weights/ Code: https://colab.research.google.com/drive/19cmX6U70zovssqIbAJaWFUEWNu4AIZBF?usp=sharing#scrollTo=-vL6Uycfjsk0

The aforementioned code is in a Google Colab notebook, which isn't a big deal. However, it uses lucid, which we could not install because we had to use Python 3.5 to resolve some other dependencies. Therefore, we simply updated the code to use VGG and our selected filter, and obtained the same results. Screenshots of the other notebook are presented below.

The meat of this code is the get_expanded_weights method, which accounts for non-adjacent interactions. It takes gradients through the model, ignoring or replacing non-linear operations with the closest linear ones. It hooks into the computation graph, which is a different approach than simply pulling out the weights like we did.

In [83]:
# Directly from https://colab.research.google.com/drive/19cmX6U70zovssqIbAJaWFUEWNu4AIZBF?usp=sharing#scrollTo=-vL6Uycfjsk0

# Shown here as code for reference.  This was not executed as part of this notebook.

def ForceAvgPoolGrad(op, grad):
    inp = op.inputs[0]

    op_args = [op.get_attr("ksize"), op.get_attr("strides"), op.get_attr("padding")]
    smooth_out = tf.nn.avg_pool(inp, *op_args)
    inp_smooth_grad = tf.gradients(smooth_out, [inp], grad)[0]

    return inp_smooth_grad

def MaxAsAvgPoolGrad(op, grad):
    inp = op.inputs[0]

    op_args = [op.get_attr("ksize"), op.get_attr("strides"), op.get_attr("padding")]
    smooth_out = tf.nn.avg_pool(inp, *op_args)
    inp_smooth_grad = tf.gradients(smooth_out, [inp], grad)[0]

    return inp_smooth_grad

@functools.lru_cache(128)
def get_expanded_weights(model, layer1, layer2, W=5):
  # Set up a graph for doing attribution...
  with tf.Graph().as_default(), tf.Session(), gradient_override_map({"Relu": lambda op, grad: grad, "MaxPool": MaxAsAvgPoolGrad}):
    t_input = tf.placeholder_with_default(tf.zeros([1,224,224,3]), [None,None, None, 3])
    T = render.import_model(model, t_input, t_input)

    # Compute activations; this gives us numpy arrays with the right number of channels
    acts1 = T(layer1).eval()
    acts2 = T(layer2).eval()

    # Compute gradient from center; due to overrides this just multiplies out the weights
    t_offset = (tf.shape(T(layer2))[1]-1)//2
    t_center = T(layer2)[0, t_offset, t_offset]
    n_chan2 = tf.placeholder("int32", [])
    t_grad = tf.gradients(t_center[n_chan2], [T(layer1)])[0]
    arr = np.stack([t_grad.eval({n_chan2: i, T(layer1): acts1[:,0:W,0:W]})[0] for i in range(acts2.shape[-1])], -1)

    return arr

We changed the model to VGG19 (this one is from the model zoo): model19.JPG

Now, we get the weights and identify the top-six indexes. Note that the default for the normalization is None, which for a 2-D matrix equates to the Frobenius norm that we specified explicitly. Also note that the layer names are slightly different in this model. Finally, we had to adjust for a 3x3 filter as opposed to the 5x5 default.
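To confirm the point about the default norm, a standalone NumPy sketch:

```python
import numpy as np

# For 2-D arrays, np.linalg.norm with ord=None (the default) computes the
# Frobenius norm, so passing ord='fro' explicitly is equivalent.
kernel = np.arange(9, dtype=float).reshape(3, 3)
default_norm = np.linalg.norm(kernel)
frobenius = np.linalg.norm(kernel, ord='fro')
print(np.isclose(default_norm, frobenius))  # True
```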

Same results!

indexes19.JPG

Visualization

In [41]:
# Normalize the Filter weights (from https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/)
filter8_min, filter8_max = block3_conv2_filter8_weights.min(), block3_conv2_filter8_weights.max()
block3_conv2_filter8_weights = (block3_conv2_filter8_weights - filter8_min) / (filter8_max - filter8_min)
print("Minimum: {}".format(block3_conv2_filter8_weights.min()))
print("Maximum: {}".format(block3_conv2_filter8_weights.max()))
Minimum: 0.0
Maximum: 1.0
In [42]:
def show_filter(colormap):
    # Vizualization (from https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/)
    for i in range(len(top_indexes)):
        f = block3_conv2_filter8_weights[:, :, top_indexes[i]]
        ax= plt.subplot(3, 3, i+1)
        ax.set_xticks([])
        ax.set_yticks([])
        plt.title("Index: " + str(int(top_indexes[i])))
        plt.imshow(f[:, :],interpolation="nearest",cmap=colormap)
    plt.tight_layout()
    plt.show()
Colormap

The dark squares indicate small or inhibitory weights and the light squares represent large or excitatory weights.

In [51]:
show_filter("Blues")
The previous layer activations

All images below are from OpenAI Microscope (https://microscope.openai.com/models/vgg19_caffe/)

In [52]:
show_image(load_image('channel-48.png'), "3A-Index 48")
show_image(load_image('channel-32.png'), "3A-Index 32")
show_image(load_image('channel-93.png'), "3A-Index 93")
show_image(load_image('channel-182.png'), "3A-Index 182")
show_image(load_image('channel-139.png'), "3A-Index 139")
show_image(load_image('channel-203.png'), "3A-Index 203")
This layer activation

Also from OpenAI Microscope

In [53]:
show_image(load_image('channel-8.png'), "3B-Index 8")

Discussion

The images above tend to show a circuit where primarily circular shapes with high contrast at the edges are excitatory. This follows the analysis above. It is fairly clear in the visualization of the filters, where we see the excitation in an L-like shape moving from lower left to upper right. That said, these filters are fairly high in the architecture and are capturing abstract concepts. In general, this is logical, but it is difficult to say with certainty.

Part 4: Using image gradient techniques to visualize what each of the six strongest filters is most excited by

Below, we run the six strongest filters on our DOG_FIRE_PIZZA image. We also look at the colormaps for each filter to see how that changes the excitatory or inhibitory weights. Based on what the filter picks up in each image, we can analyze what it gets excited by.
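`generate_pattern` (defined earlier in the notebook, following Chollet) performs gradient ascent in input space: starting from a noisy image, it repeatedly steps the input in the direction that increases the mean activation of the chosen filter, using Keras backend gradients on VGG. The principle can be sketched with a toy linear "filter" in plain NumPy (illustrative only; the names and the linear response here are our assumptions, not the notebook's code):

```python
import numpy as np

def ascend(img, w, steps=50, lr=1.0):
    """Gradient ascent on loss = mean(img * w): step img toward a higher response.
    For this linear response the gradient w.r.t. img is simply w / w.size."""
    for _ in range(steps):
        grad = w / w.size
        # L2-normalize the gradient, the same trick generate_pattern uses
        grad = grad / (np.sqrt(np.mean(grad ** 2)) + 1e-5)
        img = img + lr * grad
    return img

rng = np.random.RandomState(0)
w = rng.randn(8, 8)       # toy "filter"
img0 = rng.rand(8, 8)     # random starting image, as in generate_pattern
img = ascend(img0, w)
```

After the loop, `np.mean(img * w)` is larger than `np.mean(img0 * w)`; the real `generate_pattern` does the same thing, but with the gradient of a VGG filter's mean activation instead of a fixed linear response.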

Block 3, Conv1, Index48

In [54]:
block3_conv1_filter48_exciter = generate_pattern('block3_conv1', 48)
In [55]:
# Save the image
im_48 = Image.fromarray(block3_conv1_filter48_exciter)
im_48.save("block3_conv1_filter48_exciter.jpeg")
In [56]:
EXCITER_48 = load_image('block3_conv1_filter48_exciter.jpeg')
img_tensor_48 = preprocess_image_for_VGG(EXCITER_48)
activations_48 = activation_model.predict(img_tensor_48)
selected_layer_activation_48 = activations_48[8]

show_image(EXCITER_48, "Exciter Image")
show_image(selected_layer_activation_48[0, :, :, 48], title='Activation visualization from above', cmap='viridis')
In [57]:
img_tensor_48 = preprocess_image_for_VGG(DOG_FIRE_PIZZA)
activations_48 = activation_model.predict(img_tensor_48)
selected_layer_activation_48 = activations_48[8]

show_image(DOG_FIRE_PIZZA, "Our selected image")
show_image(selected_layer_activation_48[0, :, :, 48], title='Activation visualization from above', cmap='viridis')

For the DOG_FIRE_PIZZA image, this filter is excited by shadows created by contrast between regions of different coloration. The shadow between the pizza and the table, the shadow between the dog's right ear and the fireplace, and the shadow between the window sill and the wood in the background all produce excitation in the filtered image.

Block 3, Conv1, Index32

In [58]:
block3_conv1_filter32_exciter = generate_pattern('block3_conv1', 32)
In [59]:
# Save the image
im_32 = Image.fromarray(block3_conv1_filter32_exciter)
im_32.save("block3_conv1_filter32_exciter.jpeg")
In [60]:
EXCITER_32 = load_image('block3_conv1_filter32_exciter.jpeg')
img_tensor_32 = preprocess_image_for_VGG(EXCITER_32)
activations_32 = activation_model.predict(img_tensor_32)
selected_layer_activation_32 = activations_32[8]

show_image(EXCITER_32, "Exciter Image")
show_image(selected_layer_activation_32[0, :, :, 32], title='Activation visualization from above', cmap='viridis')
In [61]:
img_tensor_32 = preprocess_image_for_VGG(DOG_FIRE_PIZZA)
activations_32 = activation_model.predict(img_tensor_32)
selected_layer_activation_32 = activations_32[8]

show_image(DOG_FIRE_PIZZA, "Our selected image")
show_image(selected_layer_activation_32[0, :, :, 32], title='Activation visualization from above', cmap='viridis')

For the DOG_FIRE_PIZZA image, this filter is excited by dark colors. The burnt crust on the pizza; the dog's ears, nose, eyes, and paws; and the outline of the fireplace all create excitation in the filtered image.

Block 3, Conv1, Index93

In [62]:
block3_conv1_filter93_exciter = generate_pattern('block3_conv1', 93)
In [63]:
# Save the image
im_93 = Image.fromarray(block3_conv1_filter93_exciter)
im_93.save("block3_conv1_filter93_exciter.jpeg")
In [64]:
EXCITER_93 = load_image('block3_conv1_filter93_exciter.jpeg')
img_tensor_93 = preprocess_image_for_VGG(EXCITER_93)
activations_93 = activation_model.predict(img_tensor_93)
selected_layer_activation_93 = activations_93[8]

show_image(EXCITER_93, "Exciter Image")
show_image(selected_layer_activation_93[0, :, :, 93], title='Activation visualization from above', cmap='viridis')
In [65]:
img_tensor_93 = preprocess_image_for_VGG(DOG_FIRE_PIZZA)
activations_93 = activation_model.predict(img_tensor_93)
selected_layer_activation_93 = activations_93[8]

show_image(DOG_FIRE_PIZZA, "Our selected image")
show_image(selected_layer_activation_93[0, :, :, 93], title='Activation visualization from above', cmap='viridis')

For the DOG_FIRE_PIZZA image, this filter is excited by shadows created by contrast between dark and light colors. This excitation can be seen in the filtered image at the burnt pieces on the pizza crust where a small glimpse of unburnt crust shows through, the outline of the dog's nose, eyes, and mouth, the window sill background, and the area between the dog and the fire.

Block 3, Conv1, Index182

In [66]:
block3_conv1_filter182_exciter = generate_pattern('block3_conv1', 182)
In [67]:
# Save the image
im_182 = Image.fromarray(block3_conv1_filter182_exciter)
im_182.save("block3_conv1_filter182_exciter.jpeg")
In [68]:
EXCITER_182 = load_image('block3_conv1_filter182_exciter.jpeg')
img_tensor_182 = preprocess_image_for_VGG(EXCITER_182)
activations_182 = activation_model.predict(img_tensor_182)
selected_layer_activation_182 = activations_182[8]

show_image(EXCITER_182, "Exciter Image")
show_image(selected_layer_activation_182[0, :, :, 182], title='Activation visualization from above', cmap='viridis')
In [69]:
img_tensor_182 = preprocess_image_for_VGG(DOG_FIRE_PIZZA)
activations_182 = activation_model.predict(img_tensor_182)
selected_layer_activation_182 = activations_182[8]

show_image(DOG_FIRE_PIZZA, "Our selected image")
show_image(selected_layer_activation_182[0, :, :, 182], title='Activation visualization from above', cmap='viridis')

For the DOG_FIRE_PIZZA image, this filter is excited by light colors, but not white. In the filtered image the excitations can be seen in the dog's ears, the pink part of the dog's nose and mouth, the shadow on the right part of the dog's chin, parts of the pizza, and the edge of the table.

Block 3, Conv1, Index139

In [70]:
block3_conv1_filter139_exciter = generate_pattern('block3_conv1', 139)
In [71]:
# Save the image
im_139 = Image.fromarray(block3_conv1_filter139_exciter)
im_139.save("block3_conv1_filter139_exciter.jpeg")
In [72]:
EXCITER_139 = load_image('block3_conv1_filter139_exciter.jpeg')
img_tensor_139 = preprocess_image_for_VGG(EXCITER_139)
activations_139 = activation_model.predict(img_tensor_139)
selected_layer_activation_139 = activations_139[8]

show_image(EXCITER_139, "Exciter Image")
show_image(selected_layer_activation_139[0, :, :, 139], title='Activation visualization from above', cmap='viridis')
In [73]:
img_tensor_139 = preprocess_image_for_VGG(DOG_FIRE_PIZZA)
activations_139 = activation_model.predict(img_tensor_139)
selected_layer_activation_139 = activations_139[8]

show_image(DOG_FIRE_PIZZA, "Our selected image")
show_image(selected_layer_activation_139[0, :, :, 139], title='Activation visualization from above', cmap='viridis')

For the DOG_FIRE_PIZZA image, this filter is excited by stark outlines (this one was difficult to figure out since much of the filtered image is indistinguishable). Excitation appears at the outline of the burnt pizza crust (towards the front of the picture) against the table, the outline of the dog's eyes and chin, the outline of the dog's ear against the fire, the nature seen outside the window, and the outline of the fireplace wherever the material changes (fire, frame, carpet, etc.).

Block 3, Conv1, Index20

In [74]:
block3_conv1_filter20_exciter = generate_pattern('block3_conv1', 20)
In [75]:
# Save the image
im_20 = Image.fromarray(block3_conv1_filter20_exciter)
im_20.save("block3_conv1_filter20_exciter.jpeg")
In [76]:
EXCITER_20 = load_image('block3_conv1_filter20_exciter.jpeg')
img_tensor_20 = preprocess_image_for_VGG(EXCITER_20)
activations_20 = activation_model.predict(img_tensor_20)
selected_layer_activation_20 = activations_20[8]

show_image(EXCITER_20, "Exciter Image")
show_image(selected_layer_activation_20[0, :, :, 20], title='Activation visualization from above', cmap='viridis')
In [77]:
img_tensor_20 = preprocess_image_for_VGG(DOG_FIRE_PIZZA)
activations_20 = activation_model.predict(img_tensor_20)
selected_layer_activation_20 = activations_20[8]

show_image(DOG_FIRE_PIZZA, "Our selected image")
show_image(selected_layer_activation_20[0, :, :, 20], title='Activation visualization from above', cmap='viridis')

This excitation is quite unexpected because the exciter image has the excitations going along the top of the image and down the left side of the image. For the DOG_FIRE_PIZZA image this filter is excited by stark changes from light to dark. There is excitation around the dog's eyes and throughout the fire and the pizza.
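The six pairs of cells above repeat the same preprocess-predict-display steps, which invites stale-variable mistakes. A small helper (hypothetical; the `activation_model`, `preprocess_image_for_VGG`, and `show_image` names are the notebook's own utilities) would reduce the duplication. The indexing step it wraps can be isolated and sketched as:

```python
import numpy as np

def select_activation(activations, layer_idx, filter_idx):
    """Pull one filter's 2-D activation map out of the list returned by
    activation_model.predict (the leading 0 drops the batch dimension)."""
    return activations[layer_idx][0, :, :, filter_idx]

# Sketch of the repeated cell, using the notebook's own utilities:
# img_tensor = preprocess_image_for_VGG(DOG_FIRE_PIZZA)
# act = select_activation(activation_model.predict(img_tensor), 8, 48)
# show_image(DOG_FIRE_PIZZA, "Our selected image")
# show_image(act, title='Activation visualization from above', cmap='viridis')
```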

Visualization with Six Strongest Input Filters

In [108]:
block3_conv1 = vgg_model.layers[10]
In [109]:
# Just look at the weights in the current layer
block3_conv1_weights = block3_conv1.get_weights()[0] # 0 is the weights, 1 is the bias
block3_conv1_filter48_weights = block3_conv1_weights[:, :, :, 48]
print(block3_conv1_filter48_weights.shape)
(3, 3, 256)
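`top_indexes` was computed earlier in the notebook; one plausible way to rank the 256 input channels of this 3x3x256 kernel is by the total magnitude of their weights (our assumption, the exact criterion used earlier may differ):

```python
import numpy as np

# Hypothetical ranking: score each of the 256 input channels of a 3x3x256
# kernel by the L1 magnitude of its weights and keep the six strongest.
rng = np.random.RandomState(0)
kernel = rng.randn(3, 3, 256)               # stand-in for block3_conv1_filter48_weights
strength = np.abs(kernel).sum(axis=(0, 1))  # one score per input channel
top6 = np.argsort(strength)[-6:][::-1]      # indexes of the six strongest, strongest first
print(top6)
```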
In [110]:
# Normalize the Filter weights (from https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/)
filter48_min, filter48_max = block3_conv1_filter48_weights.min(), block3_conv1_filter48_weights.max()
block3_conv1_filter48_weights = (block3_conv1_filter48_weights - filter48_min) / (filter48_max - filter48_min)
print("Minimum: {}".format(block3_conv1_filter48_weights.min()))
print("Maximum: {}".format(block3_conv1_filter48_weights.max()))
Minimum: 0.0
Maximum: 1.0
In [111]:
def show_filter(colormap):
    # Visualization (from https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/)
    for i in range(len(top_indexes)):
        f = block3_conv1_filter48_weights[:, :, top_indexes[i]]
        ax = plt.subplot(3, 3, i + 1)
        ax.set_xticks([])
        ax.set_yticks([])
        plt.title("Index: " + str(int(top_indexes[i])))
        plt.imshow(f[:, :], interpolation="nearest", cmap=colormap)
    plt.tight_layout()
    plt.show()
In [112]:
show_filter("Blues")

The colormaps for each of the six strongest filters are illustrated above. Just as with the gray colormap shown previously, the light squares indicate excitatory weights and the dark squares indicate inhibitory weights. The three images below were chosen because they produce the largest activations for the DOG_FIRE_PIZZA image, so these colormaps apply to them as well.

The colormap for index 20 has its excitation on the right side, but the exciter image has its excitation on the left side. The colormap for index 93 has its excitation on the left side, but the exciter image shows excitation scattered throughout the block. It seems the excitations expected from the colormaps did not fully line up with what was excited in either the exciter images or the DOG_FIRE_PIZZA image.

Part 4 with different images

These images were chosen because they appeared as "pieces of images from the training dataset that result in the largest activations from the given unit" on OpenAI Microscope when we ran the DOG_FIRE_PIZZA image through VGG19. The three images chosen were a daisy, the Kellogg's K, and a yellow Y. Each image was run through VGG19 and each of the six strongest filters found previously. For the first filter, we also check how well VGG classifies each image.

Based on what the filter picks up in each image and the analysis done for each filter in the DOG_FIRE_PIZZA image, we can analyze what it gets excited by. We are seeing which image produces the most excitation for each filter. Additionally, we will see whether this image was excited by the same thing as said for the previous image.

*Note: We realized, after the VGG classification and working through the lab, that PNG files with solid backgrounds do not seem to be classified as well as images with "natural" backgrounds.

Index48

Daisy

In [84]:
DAISY = load_image('daisy.jpg')
show_image(DAISY, "Daisy")
In [85]:
predict(vgg_model, DAISY)
Predicted: [('n11939491', 'daisy', 0.9996908), ('n02206856', 'bee', 0.000109171626), ('n02219486', 'ant', 8.342355e-05), ('n03991062', 'pot', 2.7177712e-05), ('n01944390', 'snail', 1.8674871e-05)]
In [86]:
img_tensor = preprocess_image_for_VGG(DAISY)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(DAISY, "Our selected image")
show_image(selected_layer_activation[0, :, :, 48], title='Activation visualization from above', cmap='viridis')
In [87]:
KELLOGG = load_image('kellogg.png')
show_image(KELLOGG, "Kellogg K")
In [88]:
predict(vgg_model, KELLOGG)
Predicted: [('n03476684', 'hair_slide', 0.074615136), ('n03937543', 'pill_bottle', 0.042715635), ('n04116512', 'rubber_eraser', 0.04236552), ('n03929660', 'pick', 0.035975147), ('n03291819', 'envelope', 0.031806476)]
In [89]:
img_tensor = preprocess_image_for_VGG(KELLOGG)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(KELLOGG, "Our selected image")
show_image(selected_layer_activation[0, :, :, 48], title='Activation visualization from above', cmap='viridis')

Yellow Y

In [90]:
YELLOW = load_image('yellow.png')
show_image(YELLOW, "Yellow Y")
In [91]:
predict(vgg_model, YELLOW)
Predicted: [('n03196217', 'digital_clock', 0.20518488), ('n04579432', 'whistle', 0.11810768), ('n04131690', 'saltshaker', 0.07835208), ('n02747177', 'ashcan', 0.049804132), ('n04553703', 'washbasin', 0.04663288)]
In [92]:
img_tensor = preprocess_image_for_VGG(YELLOW)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(YELLOW, "Our selected image")
show_image(selected_layer_activation[0, :, :, 48], title='Activation visualization from above', cmap='viridis')

What was said for the DOG_FIRE_PIZZA image: For the DOG_FIRE_PIZZA image, this filter is excited by shadows created by contrast between regions of different coloration.

The image that creates the most excitation for this filter is the Yellow Y. The analysis for what excites this filter stays true. There is excitation around the letter where shadows appear from the color changing from yellow to black, especially on the right side, which is what was seen on the colormap.

Index32

Daisy

In [93]:
img_tensor = preprocess_image_for_VGG(DAISY)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(DAISY, "Our selected image")
show_image(selected_layer_activation[0, :, :, 32], title='Activation visualization from above', cmap='viridis')
In [94]:
img_tensor = preprocess_image_for_VGG(KELLOGG)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(KELLOGG, "Our selected image")
show_image(selected_layer_activation[0, :, :, 32], title='Activation visualization from above', cmap='viridis')

Yellow Y

In [95]:
img_tensor = preprocess_image_for_VGG(YELLOW)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(YELLOW, "Our selected image")
show_image(selected_layer_activation[0, :, :, 32], title='Activation visualization from above', cmap='viridis')

What was said for the DOG_FIRE_PIZZA image: For the DOG_FIRE_PIZZA image this filter is excited by dark colors.

The image that creates the most excitation for this filter is the Daisy. The analysis for what excites this filter stays true. There is excitation around the petals where the dark shadows in the background appear, especially in the middle, which is what was seen on the colormap.

Index93

Daisy

In [96]:
img_tensor = preprocess_image_for_VGG(DAISY)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(DAISY, "Our selected image")
show_image(selected_layer_activation[0, :, :, 93], title='Activation visualization from above', cmap='viridis')
In [97]:
img_tensor = preprocess_image_for_VGG(KELLOGG)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(KELLOGG, "Our selected image")
show_image(selected_layer_activation[0, :, :, 93], title='Activation visualization from above', cmap='viridis')

Yellow Y

In [98]:
img_tensor = preprocess_image_for_VGG(YELLOW)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(YELLOW, "Our selected image")
show_image(selected_layer_activation[0, :, :, 93], title='Activation visualization from above', cmap='viridis')

What was said for the DOG_FIRE_PIZZA image: For the DOG_FIRE_PIZZA image, this filter is excited by shadows created by contrast between dark and light colors.

The image that creates the most excitation for this filter is the Yellow Y. The analysis for what excites this filter stays true. There is excitation around the letter where the color changes from yellow to black, especially on the left side and on the left edge of the image, which is what was seen on the colormap.

Index182

Daisy

In [105]:
img_tensor = preprocess_image_for_VGG(DAISY)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(DAISY, "Our selected image")
show_image(selected_layer_activation[0, :, :, 182], title='Activation visualization from above', cmap='viridis')
In [104]:
img_tensor = preprocess_image_for_VGG(KELLOGG)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(KELLOGG, "Our selected image")
show_image(selected_layer_activation[0, :, :, 182], title='Activation visualization from above', cmap='viridis')

Yellow Y

In [99]:
img_tensor = preprocess_image_for_VGG(YELLOW)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(YELLOW, "Our selected image")
show_image(selected_layer_activation[0, :, :, 182], title='Activation visualization from above', cmap='viridis')

What was said for the DOG_FIRE_PIZZA image: For the DOG_FIRE_PIZZA image, this filter is excited by light colors, but not white.

The image that creates the most excitation for this filter is the Kellogg's logo. The analysis for what excites this filter holds only partially: we would have expected the yellows to excite this filter, but they must be too dark. There is excitation around the K where a slight shadow makes the color not quite white, especially on the right side, which is what was seen on the colormap. The entire background of the logo is slightly excited, and the image has an excitatory border, seen most clearly on the right side.

Index139

Daisy

In [106]:
img_tensor = preprocess_image_for_VGG(DAISY)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(DAISY, "Our selected image")
show_image(selected_layer_activation[0, :, :, 139], title='Activation visualization from above', cmap='viridis')
In [103]:
img_tensor = preprocess_image_for_VGG(KELLOGG)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(KELLOGG, "Our selected image")
show_image(selected_layer_activation[0, :, :, 139], title='Activation visualization from above', cmap='viridis')

Yellow Y

In [100]:
img_tensor = preprocess_image_for_VGG(YELLOW)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(YELLOW, "Our selected image")
show_image(selected_layer_activation[0, :, :, 139], title='Activation visualization from above', cmap='viridis')

What was said for the DOG_FIRE_PIZZA image: For the DOG_FIRE_PIZZA image this filter is excited by stark outlines (this one was difficult to figure out since much of the filtered image is indistinguishable).

As we can see in the filtered images above, these are also fairly indistinguishable. Every image was blurred to a large degree.

The image that creates the most excitation for this filter is the Yellow Y. The analysis for what excites this filter holds only partially: we would have expected the outlines of the K in the Kellogg's logo to be starker than those of the Y, which appear less defined. There is excitation around the letter where the color changes from yellow to black, especially on the left side and in the top right corner of the image, which is not really what the colormap suggested. The colormap has the most excitation occurring on the right side and none in the top middle, where the inside of the Y is excited.

Index20

Daisy

In [107]:
img_tensor = preprocess_image_for_VGG(DAISY)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(DAISY, "Our selected image")
show_image(selected_layer_activation[0, :, :, 20], title='Activation visualization from above', cmap='viridis')
In [102]:
img_tensor = preprocess_image_for_VGG(KELLOGG)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(KELLOGG, "Our selected image")
show_image(selected_layer_activation[0, :, :, 20], title='Activation visualization from above', cmap='viridis')

Yellow Y

In [101]:
img_tensor = preprocess_image_for_VGG(YELLOW)
activations = activation_model.predict(img_tensor)
selected_layer_activation = activations[8]

show_image(YELLOW, "Our selected image")
show_image(selected_layer_activation[0, :, :, 20], title='Activation visualization from above', cmap='viridis')

What was said for the DOG_FIRE_PIZZA image: This excitation is quite unexpected because the exciter image has the excitations going along the top of the image and down the left side of the image. For the DOG_FIRE_PIZZA image this filter is excited by stark changes from light to dark.

The image that creates the most excitation for this filter is the Daisy. The analysis for what excites this filter stays mostly true. We see excitation occurring at the top of the image, similarly to the exciter image. There is excitation around the petals, the edge of the flower, and some of the background where dark areas border light ones. The colormap is interesting because it has excitation occurring mostly on the right side, whereas the exciter image had excitation occurring mostly on the left side. However, both the DOG_FIRE_PIZZA image and the DAISY image have excitation in the middle and on the right side.

Visualization with Six Strongest Input Filters--Same as above, but included again for comparison purposes

In [113]:
# Just look at the weights in the current layer
block3_conv1_weights = block3_conv1.get_weights()[0] # 0 is the weights, 1 is the bias
block3_conv1_filter48_weights = block3_conv1_weights[:, :, :, 48]
print(block3_conv1_filter48_weights.shape)
(3, 3, 256)
In [114]:
# Normalize the Filter weights (from https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/)
filter48_min, filter48_max = block3_conv1_filter48_weights.min(), block3_conv1_filter48_weights.max()
block3_conv1_filter48_weights = (block3_conv1_filter48_weights - filter48_min) / (filter48_max - filter48_min)
print("Minimum: {}".format(block3_conv1_filter48_weights.min()))
print("Maximum: {}".format(block3_conv1_filter48_weights.max()))
Minimum: 0.0
Maximum: 1.0
In [115]:
def show_filter(colormap):
    # Visualization (from https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks/)
    for i in range(len(top_indexes)):
        f = block3_conv1_filter48_weights[:, :, top_indexes[i]]
        ax = plt.subplot(3, 3, i + 1)
        ax.set_xticks([])
        ax.set_yticks([])
        plt.title("Index: " + str(int(top_indexes[i])))
        plt.imshow(f[:, :], interpolation="nearest", cmap=colormap)
    plt.tight_layout()
    plt.show()
In [116]:
show_filter("Blues")

Explain how this circuit works

The colormap above shows a pattern of excitatory weights on the edges, particularly the right side. Because the filter is very abstract, it is not trivial to identify how the most exciting filters in the prior layer contribute to its formation. Nevertheless, here is our hypothesis. Filters 32 and 182 may serve to recognize spots. Filter 139 may detect high-frequency patterns; combined with filter 20, which appears to be an edge detector, we hypothesize that the resulting filter detects high-frequency circles. This is hypothesized and illustrated in Part 2. The colormap in Part 2 is more refined and shows more of an "L shape" in its excitatory weights, which is reflected in its ability to pick up circles better than the filters in Part 4.

Define the properties of this circuit

This layer displays pose-invariant properties. It is not looking for a specific object (such as a dog) but instead identifies patterns; here, the patterns it picks up are high-frequency circles. At this depth the network has not picked up particular semantic relations, only more abstract patterns, so we cannot say this circuit is polysemantic in nature. The high-low frequency detector feature can be seen in the OpenAI Microscope previous-layer activations in Part 3: most of the filters detect a low-frequency concentration in the center and higher frequencies at the edges. The detector looks for the same thing in the images (changes in frequency and circles) but in different orientations, and in each filter it is finding the boundaries of objects.